Abstract

Genetic algorithms (GAs) that solve hard problems quickly, reliably, and
accurately are called competent GAs. When the fitness landscape of a problem
changes over time, the problem is called a non-stationary, dynamic, or
time-variant problem. This paper investigates the use of competent GAs for
non-stationary optimization problems. More specifically, we use an
information-theoretic approach based on the minimum description length
principle to adaptively identify regularities and substructures that can be
exploited to respond quickly to changes in the environment. We also develop
a special class of problems with bounded difficulty for testing optimization
methods on non-stationary problems. The results provide new insights into
non-stationary optimization problems and show that a search algorithm which
automatically identifies and exploits possible decompositions is more robust
and responds more quickly to changes than a simple genetic algorithm.



1 Introduction 

Real-world problems are rarely static. Problems change over time, a factor
compounded by the fact that the environments in which they function are also in a
constant state of flux. Although significant advances have been made in the
development and design of genetic and evolutionary algorithms [OElHIl], only a few
have accounted for the changing nature of the problems themselves [5]. Re-solving
problems from scratch every time a change occurs is neither practical nor feasible,
and is tantamount to re-inventing the wheel every time a problem with the wheel
occurs. This is especially pertinent where change is so frequent that re-solving
the original problem can never be appropriate.

We hypothesize that to solve non-stationary problems efficiently, previously 
encountered solutions can be used to extract structural knowledge about the prob- 
lem in hand. Identifying important regularities and sub-structures in a problem can 
help in responding quickly and tracking optima when the environment changes. A 
class of evolutionary algorithms that automatically discover the problem decompo- 
sition is known as competent genetic algorithms |6l |7l [HI l2l UHl HH E21 HSl [141 . In 
essence, competent genetic algorithms automatically and adaptively identify im- 
portant sub-structures of an underlying search problem and use them to efficiently 
explore the search space. The aim of this paper is to test our hypothesis using
one of these methods as a representative.

More specifically, we use the extended compact genetic algorithm (ecGA) [10]
as a representative probabilistic model-building GA. In this type of GA, the
variation operators are replaced by building and sampling a probabilistic model
of promising solutions. In ecGA, the probabilistic model is selected using the
information-theoretic measure known as the minimum description length principle
fT31 [l"6l [TTIl . The structure and the probabilities of the decomposition model
are manipulated when the environment changes to speed up the solver's response to
the changes. As in other studies of genetic and evolutionary algorithms on
non-stationary problems, we assume that the new solutions are related to the
old ones and that the changes are bounded. Specifically, we incorporate bounded
changes to both the problem structure and the fitness landscape. It should be noted
that if the environment changes either unboundedly or randomly, no method will,
on average, outperform restarting the solver from scratch every time a change
occurs.

The paper is structured as follows. In the next section, we briefly review the
background material relevant to this paper. We then review ecGA, followed by the
methods we use for dynamic optimization. Finally, we present the experimental
setup, results, and discussion.

2 Background Materials

In this section, we present a brief overview of previous work on evolutionary
computation methods for dynamic environments and on adaptive automatic
decomposition approaches.

2.1 Dynamic Environments 

To date, there have been three main evolutionary approaches to solving
optimization problems in changing environments: diversity control,
memory-based, and multi-population methods. We present a brief overview of this
literature here and refer the reader to [5] for a more detailed review of this
large and growing field.

Diversity has been a focal point of much recent work on enhancing the
adaptiveness of evolutionary methods for dynamic optimization problems. Diversity
is controlled in two ways: either by increasing diversity whenever a change
is detected, or by maintaining high diversity throughout the evolutionary run.
Examples of the former include the hyper-mutation method [18], the variable local
search technique [19], and other methods in [20] and [21]. The main methods in the
latter group include redundancy [22, 23, 24, 25, 26, 27], random immigrants [28],
aging [29], and the thermodynamical genetic algorithm [30, 31].

Memory-based approaches have attracted much attention in the literature. Two main
types exist: implicit and explicit memory. In implicit memory [23, 32, 33], a
redundant representation is used as a means of memory. In explicit memory [31],
specific information, which may include solutions, is stored and retrieved when
needed by the evolutionary mechanism.

The third class of approaches depends on speciation and multiple populations.
Sub-populations are maintained and each becomes specialized on a part of the
search space. This facilitates the process of tracking the optima as they move. An
example in this group is the self-organizing scouts method [5].

In all of this previous work (diversity control, memory-based, and
multi-population methods), the performance of different techniques may vary with
the manner in which the environment changes [5]. Branke [5] attempted to classify
different types of dynamics to gain insight into the level of difficulty of
dynamic optimization problems. A major research question here is: what makes a
dynamic optimization problem hard to solve by evolutionary methods? Another
equally important question is whether learning some decomposition of the problem
can help in responding quickly to a change in the environment, assuming that this
decomposition is not affected by the change.



2.2 Adaptive Automatic Decomposition 

One of the key challenges in the area of genetic and evolutionary algorithms is
the systematic design of genetic operators with demonstrated scalability. Based on
Holland's [34] notion of building blocks, Goldberg [2, 35, 36] proposed a
design-decomposition theory for designing effective GAs. The theory identifies the
discovery of suitable substructures or decompositions (also referred to as
linkage), and the efficient exchange of these substructures, as challenging tasks
in designing competent GAs. The design-decomposition theory not only provides an
insight into what makes a problem hard for GAs, but also has resulted in many 
competent GA designs. In essence, competent GAs successfully solve problems 
with bounded difficulties in a polynomial (sometimes sub-quadratic) number of 
function evaluations Q. A key element of competent GAs is a mechanism to 
automatically identify important substructures of the underlying search problem. 
Depending on the mechanism used to discover the problem decomposition, com- 
petent genetic algorithms can be classified into three broad categories: 

Perturbation techniques include the messy genetic algorithm [37], the fast messy
genetic algorithm [38] fTTI, the gene expression messy genetic algorithm [39],
the linkage identification by nonlinearity check genetic algorithm and the linkage
identification by monotonicity detection genetic algorithm fT3l, the dependency
structure matrix driven genetic algorithm [40], and linkage identification by
limited probing ETI.

Linkage adaptation techniques such as the linkage learning GA [42, 43, 44].

Probabilistic model building techniques [14, 45, 7] such as population-based
incremental learning [8], the bivariate marginal distribution algorithm [46],
the extended compact GA (ecGA) [10], the iterated distribution estimation
algorithm [47], and the Bayesian optimization algorithm (BOA) [48].

A more detailed survey of the various problem-decomposition mechanisms (or
genetic linkage learning) is given elsewhere [44] and in the references therein.

Despite the success of competent GAs in solving stationary search problems,
they have not been used to solve non-stationary problems apart from a preliminary
study [49]. The aim of this paper is twofold: first, to examine the performance
of ecGA in terms of its response rate, as an example of a competent GA that
automatically decomposes and identifies substructures in non-stationary problems;
and second, to test the method on problems with bounded difficulty. Our
conjecture is that a mechanism which focuses on identifying the important
substructures (or building blocks) is beneficial for dynamic optimization problems
as well. Furthermore, the problem-decomposition information serves as a way to
store past information which can be used and manipulated to respond faster to
changes in the environment.



3 The Extended Compact Genetic Algorithm 

The extended compact genetic algorithm [10] is a probabilistic model-building
genetic algorithm which replaces the traditional variation operators of genetic and
evolutionary algorithms by building a probabilistic model of promising solutions
and sampling the model to generate new candidate solutions. Harik [10] studied
the problem of linkage learning and conjectured that linkage learning is
equivalent to finding a good model of the structure underlying a set of genotypes.
He therefore used probabilistic models to learn linkage. In the ecGA method, he
proposed the use of the minimum description length (MDL) principle [15] [16] H3 to
compress good genotypes into partitions that give the shortest possible
representation. The MDL measure is a tradeoff between two complexity measures.
The first is a measure of the information content in a population, which Harik
calls "compressed population complexity", while the second is a measure of the
size of the model, which Harik calls "model complexity".

The compressed population complexity measure is a statistical complexity
measure based on the well-known information-theoretic notion of Shannon's
entropy [50]. Shannon's entropy $E(\chi_I)$ of the population treats each partition
of variables $\chi_I$ as a random variable that takes its $j$-th value with
probability $p_j$. The measure is given by

$$E(\chi_I) = -C \sum_{j=1}^{a} p_j \log p_j \qquad (1)$$

where $C$ is a constant related to the base chosen to express the logarithm and $a$
is the number of all possible bit sequences for the variables belonging to
partition $\chi_I$; that is, if the cardinality of $\chi_I$ is $\nu_I$, then
$a = 2^{\nu_I}$. This measures the amount of disorder within a population under a
decomposition scheme. Equivalently, it can be seen as the amount of information
present in the population under a specific partition scheme. The compressed
population complexity is a scaled version of the entropy, where $N$ is the
population size:

$$\text{Compressed Population Complexity} = N \sum_{I} E(\chi_I) \qquad (2)$$

The second complexity measure is associated with the model itself and measures
the complexity of the model in terms of its size:

$$\text{Model Complexity} = \log(N + 1) \sum_{I} \left(2^{\nu_I} - 1\right) \qquad (3)$$

The MDL measure is the sum of the compressed population complexity and
the model complexity:

$$\text{MDL} = N \sum_{I} \left[ -C \sum_{j=1}^{a} p_j \log p_j \right] + \log(N + 1) \sum_{I} \left(2^{\nu_I} - 1\right) \qquad (4)$$

The ecGA method can be summarized in the following steps:

1. Initialize the population at random with n individuals; 

2. Evaluate all individuals in the population; 

3. Use tournament selection without replacement to select n individuals; 

4. Use the MDL measure to recursively partition the variables until the measure 
increases; 

5. Use the partition to shuffle the building blocks (building block-wise crossover) 
to generate a new population of n individuals; 

6. If the termination condition is not satisfied, go to 2; otherwise stop. 
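
As an illustration of steps 3 and 4, the following is a minimal sketch of how an MDL score could be computed for a candidate partition and how a partition could be searched greedily. It is not the authors' implementation: the function names, the use of base-2 logarithms (so that C = 1), and the greedy pairwise-merge search starting from singleton groups are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def mdl_score(population, partition):
    """MDL = compressed population complexity + model complexity.

    Assumption: base-2 logarithms, so the constant C equals 1.
    """
    n = len(population)
    compressed = 0.0
    model = 0.0
    for group in partition:
        # Empirical distribution of the bit patterns seen in this group.
        counts = Counter(tuple(ind[i] for i in group) for ind in population)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        compressed += n * entropy
        model += math.log2(n + 1) * (2 ** len(group) - 1)
    return compressed + model


def greedy_partition(population):
    """Start from singleton groups and keep applying the best pairwise merge
    until no merge lowers the MDL score (an assumed search strategy)."""
    num_vars = len(population[0])
    partition = [(i,) for i in range(num_vars)]
    best = mdl_score(population, partition)
    while len(partition) > 1:
        candidates = []
        for a, b in combinations(partition, 2):
            merged = [g for g in partition if g not in (a, b)]
            merged.append(tuple(sorted(a + b)))
            candidates.append((mdl_score(population, merged), merged))
        score, merged = min(candidates, key=lambda c: c[0])
        if score >= best:
            break  # every merge would increase the measure: stop
        best, partition = score, merged
    return sorted(partition)


if __name__ == "__main__":
    # Bits 0 and 1 always agree; bit 2 is independent of them.
    pop = [[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]] * 50
    print(greedy_partition(pop))
```

On the toy population in the usage example, grouping the two perfectly correlated bits halves their contribution to the compressed population complexity at a small cost in model complexity, which is the tradeoff the MDL measure is meant to capture.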

4 Methods 

In this section, we present two variations of the ecGA algorithm for dynamic
environments. We assume in this paper that we have a mechanism to detect a
change in the environment. Detecting a change can be done in several ways,
including: (1) re-evaluating a number of previous solutions; and
(2) monitoring statistical measures such as the average fitness of the population
0. The focus of this paper is not, however, on how to detect a change in the 
environment; therefore, we assume that we can simply detect it. The modified 
ecGA algorithm for dynamic environments works as follows: 

1. Initialize the population at random with n individuals; 

2. If a change in the environment is detected, do:

(a) Re-initialize the population at random with n individuals; 

(b) Evaluate all individuals in the population; 

(c) Use tournament selection without replacement to select n individuals; 

(d) Use the last found partition to shuffle the building blocks (building 
block-wise crossover) to generate a new population of n individuals; 

3. Evaluate all individuals in the population; 

4. Use tournament selection without replacement to select n individuals; 

5. Use the MDL measure to recursively partition the variables until the measure 
increases; 

6. Use the partition to shuffle the building blocks (building block-wise crossover) 
to generate a new population of n individuals; 

7. If the termination condition is not satisfied, go to 2; otherwise stop. 

We call this version dcGA(1). In this version, once a change
is detected, a new population is generated at random, followed by selection and
crossover using the last generated model. The method then continues with the new
population. In the second version, dcGA(2), the last learnt model is not used to
bias the re-start; that is, the selection and crossover steps carried out on the
new randomly generated population (steps 2(b) to 2(d)) are skipped. Both versions
can be seen as re-start approaches, where the first uses the last learnt model
after the re-start, while the second does not. In ecGA, the model is re-built from
scratch in every generation. This has the advantage of recovering from possible
problems that may arise from the use of a hill-climber in learning the model.
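
The difference between the two re-start variants can be sketched as follows. This is an illustrative sketch only: the helper names (`tournament_select`, `bb_crossover`, `restart`), the flag `use_last_model`, and the default tournament size are hypothetical, and the way tournaments are formed (shuffle, split into groups, keep each winner) is one common reading of "tournament selection without replacement" rather than a confirmed detail of the paper.

```python
import random


def tournament_select(pop, fitness, size, rng):
    """Tournament selection without replacement: shuffle the population,
    split it into groups of `size`, and keep the winner of each group."""
    size = min(size, len(pop))
    winners = []
    while len(winners) < len(pop):
        order = list(range(len(pop)))
        rng.shuffle(order)
        for start in range(0, len(order) - size + 1, size):
            group = order[start:start + size]
            winners.append(pop[max(group, key=lambda i: fitness[i])])
            if len(winners) == len(pop):
                break
    return winners


def bb_crossover(parents, partition, rng):
    """Building-block-wise crossover: each child copies every block of the
    partition, as a unit, from a randomly chosen parent."""
    children = []
    for _ in parents:
        child = [0] * len(parents[0])
        for block in partition:
            donor = rng.choice(parents)
            for i in block:
                child[i] = donor[i]
        children.append(child)
    return children


def restart(n, length, evaluate, last_partition, use_last_model, rng, tour_size=2):
    """Response to a detected change.

    use_last_model=True  -> dcGA(1): random population, then selection and
                            crossover biased by the last learnt partition.
    use_last_model=False -> dcGA(2): random population only.
    """
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]
    if use_last_model:
        fitness = [evaluate(ind) for ind in pop]
        parents = tournament_select(pop, fitness, tour_size, rng)
        pop = bb_crossover(parents, last_partition, rng)
    return pop


if __name__ == "__main__":
    rng = random.Random(0)
    blocks = [(0, 1, 2), (3, 4, 5)]
    print(restart(4, 6, sum, blocks, use_last_model=True, rng=rng))
```

In both cases the returned population is then handed to the normal ecGA loop, which re-learns the partition from scratch in the next generation.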

Kargupta [11] has shown that problems with bounded complexity can be solved
in polynomial time "provided that there exists an appropriate measure that can
correctly detect the good relations". Mühlenbein [51] showed that order-$k$
functions of length $l$ are solvable in $O(l^k \log l)$ using a (1 + 1)-ES.
Goldberg et al. [38] achieved $O(l^2)$ complexity using the fast messy genetic
algorithm. Pelikan [14] reported a complexity of $O(n^{1.65})$ using BOA. Sastry
and Goldberg have shown that the convergence time for ecGA follows the relation
derived by Mühlenbein and Voosen [52] for breeder GAs, where the convergence time
scales as $\sqrt{l}/I$, where $I$ is the selection intensity and $l$ is the number
of bits in the chromosome.

In a changing environment, let us assume a chromosome with $BB$ building
blocks, each of order $k$ bits, so that $l = k \times BB$. The ecGA will behave
according to the previous complexity relation to build a correct decomposition
model. If a change in the environment does not affect the decomposition but only
affects the peaks within building blocks, a complete enumeration of all possible
solutions within each building block would have a time complexity of
$\Theta(m \cdot 2^k)$ to reach the new optima, where $m = BB$ is the number of
building blocks. The $\Theta$ notation denotes a tight bound (both lower and
upper). This is not very expensive. Assume a 5-bit building block replicated 100
times (a 500-bit problem); the cost of tracking the optima when the decomposition
does not change would be $2^5 \times 100 = 3200$ objective evaluations. This cost
is lower than what the experiments will show because the algorithm is designed to
handle the general case in which the decomposition may also change, rather than
the very specific case of a fixed decomposition.
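
The evaluation count in this argument can be checked with a short sketch that enumerates every setting of each building block independently. The function name `track_optimum` and the per-block fitness callback are hypothetical; the one-max block fitness in the usage example is only a stand-in for whatever sub-function the environment currently defines.

```python
from itertools import product


def track_optimum(block_fitness, k, m):
    """Enumerate all 2**k settings of each of the m building blocks and
    return the best full string together with the number of evaluations."""
    solution = []
    evaluations = 0
    for block in range(m):
        best_bits, best_value = None, float("-inf")
        for bits in product((0, 1), repeat=k):
            evaluations += 1
            value = block_fitness(block, bits)
            if value > best_value:
                best_bits, best_value = bits, value
        solution.extend(best_bits)
    return solution, evaluations


if __name__ == "__main__":
    # Placeholder block fitness: number of ones in the block.
    solution, evaluations = track_optimum(lambda block, bits: sum(bits), k=5, m=100)
    print(len(solution), evaluations)
```

The count depends only on k and m, which is why the cost stays modest as long as the decomposition itself is known and unchanged.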

We compare the results against a similar genetic algorithm except that the link- 
age learning based crossover operator in ecGA is replaced with a uniform crossover 
operator. We call this algorithm uGA to emphasize its use of uniform crossover 
with genetic algorithms. In the following section, we will present the experiments 
and the test functions used to test the proposed method. 
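
For reference, uniform crossover can be sketched as below. The swap probability of 0.5 per position and the function name are assumptions; the paper does not state these details for uGA.

```python
import random


def uniform_crossover(parent_a, parent_b, rng, swap_prob=0.5):
    """Uniform crossover: each position is exchanged between the two
    parents independently with probability swap_prob (assumed 0.5)."""
    child_a, child_b = [], []
    for a, b in zip(parent_a, parent_b):
        if rng.random() < swap_prob:
            a, b = b, a
        child_a.append(a)
        child_b.append(b)
    return child_a, child_b


if __name__ == "__main__":
    rng = random.Random(0)
    print(uniform_crossover([0] * 8, [1] * 8, rng))
```

Because every position is treated independently, the operator has no notion of which bits belong together, so it tends to break up building blocks that a linkage-based crossover would exchange as a unit.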



5 Experiments 

5.1 Test Functions 

A special class of problems that represents a challenge to GA methods is known
as "problems of bounded difficulty". These problems are characterized by two
main features: they are additively decomposable and separable, and they are
uniformly scaled. A function $f(X) = f(x_0, \ldots, x_n)$ is said to be additively
decomposable and separable iff there exists a partition
$\langle X_j \rangle_{j=1}^{m}$ such that $X_j \neq \emptyset$, $j = 1, \ldots, m$;
$X_j \cap X_k = \emptyset$, $j \neq k$; and $\bigcup_{j=1}^{m} X_j = X$. Under this
partition scheme, the function $f(X)$ can be rewritten as

$$f(X) = \sum_{j=1}^{m} f_j(X_j)$$

The function is said to be uniformly scaled if all $f_j$ are derived from the same
class of functions. There are no assumptions on each $f_j$; each can be a
multimodal function and can take any functional form. Problems of bounded
difficulty have been studied widely because they provide easy-to-analyze test
functions which challenge the dynamics of simple genetic algorithms. We define the
order of difficulty of such a problem as $k = \max_j |X_j| \ll n$, where
$|\cdot|$ denotes the cardinality of a set. Solving a problem with bounded
difficulty becomes easy once the variables are correctly separated into the right
partitions; at that point, a complete enumeration of all possible solutions for
each partition is sufficient to find the global optimal solution. Here we assume
that the cardinality of each partition is small and is much smaller than the
length of the solution vector. However, in the absence of the value of $k$ and any
knowledge of which variable belongs to which partition, the problem can be tough.
Examples of problems with bounded difficulty include the Ising problem [53, 54],
trap functions [55, 56, 57], and functions which incorporate the notions of
multimodality, hierarchy, crosstalk, and deception ll2l . These test problems,
despite being easy to understand, incorporate many of the essential difficulties
for linkage identification.



5.2 Experimental Design 

We repeated each experiment 30 times with different seeds. All results are 
presented for the average performance over the 30 runs. The population size is 
fixed to 5000 in all experiments. The population size is chosen large enough to 
provide enough samples for the probabilistic model to learn the structure and to 
provide enough diversity for uGA. Termination occurs when the algorithm reaches 
the maximum number of generations of 100. We assume that the environment 
changes between generations and the changes in the environment are assumed to 
be cyclic, where we tested two cycles of length 5 and 10 generations respectively. 
The crossover probability is 1, and the tournament size is set to 16 in all experi- 
ments based on Harik's default values. 

5.3 Experiment 1



The method is tested using dynamic versions of three trap functions. Trap
functions were introduced by Ackley [55] and subsequently analyzed in detail by
others [56]|2]|^). A trap function is defined as follows

$$\text{trap}_k(u) = \begin{cases} \text{high} & \text{if } u = k \\ \text{low} - u \, \dfrac{\text{low}}{k - 1} & \text{otherwise} \end{cases}$$

where low and high are scalars, $u$ is the number of 1s in the string, and $k$ is
the order of the trap function. In this paper, we choose low = $k$ and
high = $k + 1$.
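
A minimal sketch of a single trap of order k is given below. It assumes the commonly used form of the deceptive slope, low - u * low / (k - 1), together with the paper's choice low = k and high = k + 1; the function name and default arguments are hypothetical.

```python
def trap(bits, low=None, high=None):
    """Trap function of order k = len(bits), with k >= 2.

    Assumed form: `high` when all bits are 1, otherwise a slope that
    decreases linearly from `low` (no ones) to 0 (k - 1 ones).
    Defaults follow the paper: low = k, high = k + 1.
    """
    k = len(bits)
    low = k if low is None else low
    high = k + 1 if high is None else high
    u = sum(bits)  # number of ones in the block
    if u == k:
        return high
    return low - u * low / (k - 1)


if __name__ == "__main__":
    for u in range(5):
        print(u, trap([1] * u + [0] * (4 - u)))
```

Under these assumptions the all-zeros string is a local (deceptive) optimum with value k, while the isolated global optimum at all ones has value k + 1, so local improvement steps lead away from the global optimum.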





Figure 1: The trap functions in a changing environment. (Left) trap-3; (middle)
trap-4; (right) trap-5.



In the initial set of experiments, we tested the method using traps of order 3,
4, and 5. Figure 1 depicts a graphical representation of the traps and how they
change. In odd cycles, the global optimum is when all variables are 1's, while in
even cycles and at time 0, the global optimum is when all variables are 0's.

We tested the methods with 5, 10, 15, and 20 building blocks. If we denote
the number of building blocks by BB, then the objective value of the optimal
solution for each problem is BB * (k + 1). For example, with 20 building blocks in
trap-5, the optimal solution has an objective value of 120 regardless of the
change in the environment. The environment in this first experiment does not
actually change the value of the optimal solution but severely changes the values
of the decision variables. The change is severe because the optimal solution
oscillates between two points separated by the maximum possible Hamming distance
in the Hamming subspace defined by each trap.
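
The switching behaviour can be illustrated with a short sketch. It assumes that cycles are numbered from 0, that the cycle index is the generation number divided by the cycle length, and that the even-cycle function is the mirror image of the odd-cycle trap (counting 0's instead of 1's); the function names are hypothetical and low = k, high = k + 1 as chosen in the paper.

```python
def trap(u, k):
    """Trap of order k with low = k and high = k + 1 (assumed slope form)."""
    return k + 1 if u == k else k - u * k / (k - 1)


def dynamic_trap_fitness(bits, k, generation, cycle_length):
    """Sum of traps over consecutive k-bit blocks.

    Odd cycles reward all ones; even cycles (including time 0) reward all
    zeros, modelled here by counting zeros instead of ones.
    """
    cycle = generation // cycle_length
    total = 0.0
    for start in range(0, len(bits), k):
        ones = sum(bits[start:start + k])
        u = ones if cycle % 2 == 1 else k - ones
        total += trap(u, k)
    return total


if __name__ == "__main__":
    k, bb = 5, 20
    print(dynamic_trap_fitness([0] * (k * bb), k, generation=0, cycle_length=5))
    print(dynamic_trap_fitness([1] * (k * bb), k, generation=5, cycle_length=5))
```

With 20 blocks of trap-5, the optimum is worth 120 in every cycle, while the string that was optimal in the previous cycle sits at the deceptive attractor of the current one.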

Figures 2, 3, and 4 present the performance of dcGA(1), dcGA(2), and uGA,
respectively. Starting with the performance of dcGA(1), depicted in Figure 2,
we can see that the algorithm consistently responds quickly to changes in the
environment with trap-3 regardless of the number of building blocks and the cycle
length. However, the response rate with trap-4 is lower, as indicated by the
drop in performance with cycle length 5 and the good performance with the longer
cycle length of 10. From the figure, it can be seen that the higher the order of
the trap, the slower the method responds to a change in the environment. It
can also be seen that the larger the number of copies of building blocks in the
chromosome, the slower the response to environmental changes. The slowest response
rate was encountered with trap-5 and 20 building blocks. These findings are logical,
as the level of hardness in the problem increases as the linkage and problem size
increase. That is, the harder it is to separate the variables, the more difficult
it is to learn the decomposition. In this way, we can use the order of a trap and
the problem size to quantify how hard a dynamic optimization problem is.

Similar patterns exist with dcGA(2), as depicted in Figure 3.
One can notice that the drop in performance is smaller with dcGA(1) than with
dcGA(2). Also, looking at trap-5 with cycle length 5, one can notice that
the performance of dcGA(2) is worse than the corresponding case using dcGA(1).
This is expected, as the response rate would be higher with dcGA(1) than with
dcGA(2), thanks to the bias in the initial population from the last linkage model
found. However, comparing trap-5 with cycle length 10 using dcGA(2) against the
corresponding performance using dcGA(1), one can see that the performance of
dcGA(2) is consistently better than that of dcGA(1). An explanation of this result
will be presented in the following subsection.

Figure 2: Traps using dcGA(1) with last model. (Left) Cycle 5; (right) Cycle 10.
(Top) trap-3; (middle) trap-4; (bottom) trap-5. In each graph, the four curves
correspond to 5, 10, 15, and 20 building blocks ordered from bottom up.

Figure 3: Traps using dcGA(2) without last model. (Left) Cycle 5; (right) Cycle 10.
(Top) trap-3; (middle) trap-4; (bottom) trap-5. In each graph, the four curves
correspond to 5, 10, 15, and 20 building blocks ordered from bottom up.

Figure 4: Traps using uGA. (Left) Cycle 5; (right) Cycle 10. (Top) trap-3; (middle)
trap-4; (bottom) trap-5. In each graph, the four curves correspond to 5, 10, 15,
and 20 building blocks ordered from bottom up.

Comparing the previous results to the uGA results depicted in Figure 4, one can
see that, unexpectedly, uGA is very competitive with the linkage learning model.
After careful examination of the performance of uGA, we identified that the key
reason behind the success of uGA is simple luck. The two attractors in the problem
are the all-0's and all-1's strings; if uGA converges to the wrong attractor in
one cycle, that attractor becomes the right one in the following cycle, and as uGA
converges back to the wrong attractor, the environment changes again and turns it
into the preferred attractor. In other words, the environment changes in a manner
that rewards the poor performance of uGA. To verify our analysis, we conducted a
second experiment, as explained in the following subsection.



5.4 Experiment 2 

In the second set of experiments, we modified the trap function of order 4 to
break the symmetry in the attractors. The new function is visualized in Figure 5.
At time 0 and in even cycles, the optimal solution is when all variables are set to
0's and the second attractor is when the sum of 1's is equal to 3. When the
environment changes during the odd cycles, the new optimal solution is when all
variables are set to 1's, with a new deceptive attractor when the sum of 1's is 1
or, equivalently, the number of 0's is 3. This setup guarantees that the trap is
not symmetric with regard to its attractors. Some researchers suggested that a
simple use of an XOR operator with trap functions would solve the problem easily
because once the GA converges to the wrong attractor, a simple XOR operation would
take it to the right attractor. In our design in Figure 5, breaking the symmetry in
the trap also counters the possible trick of using an XOR operator.

Figure 6 depicts the behavior of the three methods using the modified trap-4
function. As expected, the uGA method clearly shows the worst behavior among
the three methods. It is unable to respond to the changes, and in some cases it is
unable even to reach the deceptive attractor. This behavior confirms our analysis
in the previous section. When looking at dcGA(1) and dcGA(2), however, we can see
that dcGA(1) is better than dcGA(2). The dcGA(1) method is able to respond to the
changes in the environment quickly, accurately, and reliably all the time. This
result is somewhat different from the results obtained in the previous section.
The linkage has not changed between the two setups; the only change is in the
attractors. This suggests that the somewhat inferior performance of dcGA(1)
compared to dcGA(2) in the previous experiments is attributable to the crossover
operator or mixing strategy, which was slow in moving between two attractors
separated by the maximum Hamming distance.

Figure 5: The modified trap function 4 in a changing environment.



5.5 Experiment 3 

In this experiment, we subjected the environment to a severe change from a
linkage point of view. Here, the linkage boundary changes as well as the
attractors. As depicted in Figure 7, the environment switches between trap-3
with all optima at 1's and trap-4 with all optima at 0's. Moreover, in trap-3, a
deceptive attractor exists when the number of 1's is 1, while in trap-4, a
deceptive attractor exists when the number of 1's is 3. This setup is tricky in the
sense that, if a hill-climber gets trapped at the deceptive attractor for trap-4,
its behavior will be good for trap-3. However, this hill-climber will not escape
this attractor when the environment switches back to trap-4, since the solution
will be surrounded by solutions of lower quality. This setup also tests whether
any of the methods behaves like a hill-climber.

Figure 8 shows the performance of dcGA(1), dcGA(2), and uGA. We varied
the string length between 12 and 84 in steps of 12 so that the string length is
divisible by 3 and 4 (the orders of the traps). The following table lists the value
of the optimal solution for each string length with trap-3 and trap-4.

Figure 6: Modified trap-4. (Left) Cycle 5; (right) Cycle 10. (Top) dcGA(1); (middle)
dcGA(2); (bottom) uGA. In each graph, the four curves correspond to 5, 10, 15,
and 20 building blocks ordered from bottom up.

Figure 7: The switching trap function with k=3,4 in a changing environment.



By scrutinizing Figure 8, one can see that dcGA(1) responds faster to
the changes in the environment than dcGA(2). This is most noticeable with
cycle length 5, where dcGA(2) fails to recover with string length 84. The
performance of uGA was clearly inferior: it got stuck at the wrong attractor in
the first cycle and seems to have remained there, struggling to jump out of it
even with the longer cycle length.



5.6 Experiment 4 

In this section, we test the method using the moving parabola, one of the
standard functions for testing optimization in dynamic environments. In contrast
to previous experiments, this function is a minimization problem. The function is

$$f(x, t) = \sum_{i=1}^{n} \left(x_i + \delta_i(t)\right)^2$$

where $t$ is the time parameter, $x_i$ is decision variable $i$, and $\delta_i(t)$
takes the following form:

$$\delta_i(0) = 0, \quad \forall i \in \{1, \ldots, n\}$$

$$\delta_i(t) = \delta_i(t - 1) + s, \quad \forall i \in \{1, \ldots, n\}$$

where $s$ represents the severity of the changes and is taken to be 1 in this paper,
which is a highly severe change. We used 10 variables and encoded each variable
with ten bits scaled between ±40. The function is depicted in Figure 9 for a single
variable.
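
The following sketch evaluates a binary-encoded solution on a moving parabola of the form f(x, t) = sum over i of (x_i + delta_i(t))^2, with delta_i(0) = 0 and delta_i(t) = delta_i(t - 1) + s. Both the squared form of the objective and the decoding are assumptions for illustration: the paper only states that each variable uses ten bits scaled between ±40, so plain binary decoding with a linear mapping onto [-40, 40] is assumed here, and the function names are hypothetical.

```python
def decode(bits, lo=-40.0, hi=40.0):
    """Map a bit list to a real value in [lo, hi] (assumed: plain binary
    encoding with a linear scaling)."""
    value = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)


def delta(t, s=1.0):
    """Closed form of delta(0) = 0, delta(t) = delta(t - 1) + s."""
    return s * t


def moving_parabola(chromosome, t, n_vars=10, bits_per_var=10, s=1.0):
    """Objective value to be minimized at time step t."""
    total = 0.0
    for i in range(n_vars):
        gene = chromosome[i * bits_per_var:(i + 1) * bits_per_var]
        total += (decode(gene) + delta(t, s)) ** 2
    return total


if __name__ == "__main__":
    all_zero = [0] * 100   # decodes to x_i = -40 for every variable
    print(moving_parabola(all_zero, t=0))
    print(moving_parabola(all_zero, t=40))
```

Under these assumptions the minimizer at time t is x_i = -delta_i(t) for every variable, so with s = 1 the optimum moves by one unit per change along each coordinate.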

Figure 9: The moving parabola function in one dimension.



Figure 10 depicts the performance of the three methods on the moving parabola
function with cycle lengths of 5 and 10. It is very clear from the figure that uGA
performs the worst and actually diverges for some time. This behavior is
not surprising: because of the direction of the dynamics, the algorithm seems to
have come very close to an optimum, and after the dynamics changed it was hard
to track the new optima for some iterations. If we look carefully at Figure 9, we
can see that the trajectory of the movement creates a somewhat multimodal
landscape, which seems to cause problems for uGA. One may think that the behavior
of uGA is attributable to loss of diversity. We found that this is not the case, as
evidenced by the behavior of uGA with cycle length 5. If diversity had been lost,
uGA would have remained unable to respond to the changes. However, we can
see from Figure 10 that uGA managed to recover at some point and continued to
optimize the function for another few generations.

Both dcGA(1) and dcGA(2) are consistently better. On closer inspection,
one can notice that with cycle length 5, dcGA(1) is better than dcGA(2) as it gets
closer to the minimum. With cycle length 10, both methods track the movements well
and reach the exact solution.
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Figure 10: Moving Parabola (left) Cycle 5 (Right) Cycle 10, (top) dcGA(l), (mid- 
dle) dcGA(2), (Bottom) uGA. 

6 Conclusion



The results of this paper shed light on the utility of learning possible
structural decompositions in a changing environment. They show that an approach
that learns the problem structure is more robust than a simple GA when the
environment changes. The shifts between the two optima were deliberately radical,
to test the method under severe changes. In other words, if the changes in the
environment are no worse than the changes we adopted in this paper, we can
conclude that the proposed approach will respond quickly and accurately.

However, the previous results leave us with two main questions. First, where do
problems with bounded complexity arise in real life? Linkage learning shows that
we can build reliable models for solving these problems, but can we map these
lessons to real-life applications to enhance problem solving? More recently,
work [58, 59, 60] has been done to show that the lessons learnt from competent
GAs and problems with bounded complexity are very useful for solving real-life
problems. We believe that more work will appear in the near future to
substantiate this phenomenon as more researchers follow these lessons.

The second question is whether other types of methods for handling problems in a
changing environment will be superior to linkage learning when the changes in the
environment have bounded complexity, as in the examples used in this paper. As we
said in the introduction, the three main directions for handling problems in a
changing environment are memory, diversity, and speciation and niching. The uGA
method adopted in this paper uses a large population with low selection pressure
to maintain diversity in the population.

With respect to methods based on memory and speciation, we will shed light on
their problems and on the advantages of learning the problem structure. The
learning models do not depend on the locations of genes on the chromosome. On the
contrary, these models learn the relationships between the genes. Let us assume a
chromosome with m building blocks, with k bits in each building block. Let us
assume that each building block switches between two different attractors.
Moreover, let us also assume that not all building blocks are affected each time;
that is, when the environment changes, only a subset of the building blocks
switch their peaks. Therefore, not all building blocks are at the same optimum.
This is not a problem for the proposed method, but it is a major problem if we
use memory or niching. First, let us look at the use of memory. The number of
possible optima that the algorithm can alternate between would be $2^m$. This is
in effect the size of the memory needed to respond correctly to the changes in
the environment no matter what or where the changes occur under the previous
setup. This indicates that an exponential memory is needed if we wish to respond
effectively to the changes. The use of multi-population, speciation, or niching
would suffer from the same drawbacks as the memory approach. The number of peaks
can grow exponentially, which makes it hard to respond quickly to a change.
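The counting argument above can be made concrete with a short Python sketch. It only illustrates the $2^m$ growth under the stated setup (m building blocks, each sitting on one of two attractors); the function names are hypothetical and the labels 0 and 1 for the two attractors are an assumption made for illustration.

```python
from itertools import product
from typing import List, Tuple


def optima_configurations(m: int) -> List[Tuple[int, ...]]:
    """Enumerate every assignment of one of two attractors (0 or 1) to m blocks.

    Each tuple is one global optimum that the environment may switch to.
    """
    return list(product((0, 1), repeat=m))


def memory_entries_needed(m: int) -> int:
    """Number of distinct global optima a memory would have to hold: 2^m."""
    return 2 ** m


if __name__ == "__main__":
    # Small case: list all configurations explicitly.
    for config in optima_configurations(3):
        print(config)
    # Larger cases: only the count is practical.
    for m in (5, 10, 20, 40):
        print(m, memory_entries_needed(m))
```

Explicit enumeration is feasible only for small m; the count alone already shows why a memory that stores one entry per global optimum does not scale with the number of building blocks.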

One may still wonder whether we need to store all $2^m$ optima in the memory to
respond to changes. We leave this for future work, as it is still an open
research question in the area of memory-based approaches to dynamic optimization
problems, where the problem is how to determine the optimal memory size needed to
respond effectively to changes in the environment. In addition, it is possible to
combine linkage learning and memory-based methods. Overall, it can be seen from
the previous discussion that linkage learning offers many opportunities to give
new insights into dynamic optimization problems.